1. Introduction

Long-run behavior of interacting individuals can often be described within game-theoretic models.
The basic notion here is that of a Nash equilibrium. This is a state of the population - an assignment
of strategies to players - such that no player, for fixed strategies of his opponents, has an incentive
to deviate from his current strategy; the change can only diminish his payoff. Nash equilibrium is
supposed to be a result of decisions of rational players. John Maynard Smith (1974, 1982) has refined 
this concept of equilibrium to include the stability of Nash equilibria against mutants. He introduced 
the fundamental notion of an evolutionarily stable strategy. If everybody plays such a strategy, then 
a small number of mutants playing a different strategy is eliminated from the population. The
dynamical interpretation of the evolutionarily stable strategy was later provided by several authors
(Taylor and Jonker, 1978; Hofbauer et al., 1979; Zeeman, 1981). They proposed a system of differential
or difference equations, the so-called replicator equations, which describe the time-evolution
of frequencies of strategies. It is known that any evolutionarily stable strategy is an asymptotically 
stable stationary point of such dynamics (Hofbauer and Sigmund, 1988; Weibull, 1995). 

Here we will discuss a stochastic adaptation dynamics of a population of players interacting in discrete
moments of time. We will analyze two-player games with two strategies and two evolutionarily
stable strategies. The efficient strategy (also called payoff dominant) when played by the whole population
results in its highest possible payoff (fitness). The risk-dominant one is played by individuals
averse to risks. The strategy is risk dominant if it has a higher expected payoff against a player playing 
both strategies with equal probabilities. We will address the problem of equilibrium selection - which 
strategy will be played in the long run with a high frequency. 

We will review two models of adaptive dynamics of a population of a fixed number of individuals. 
In both of them, the selection part of the dynamics ensures that if the mean payoff of a given strategy 
at the time t is bigger than the mean payoff of the other one, then the number of individuals playing 
the given strategy should increase in t + 1. In the first model, introduced by Kandori, Mailath, and
Rob (1993), one assumes (as in the standard replicator dynamics) that individuals receive average 
payoffs with respect to all possible opponents - they play against the average strategy. In the second 
model, introduced by Robson and Vega-Redondo (1996), at any moment of time, individuals play 
only one game with randomly chosen opponents. In both models, players may mutate with a small
probability, hence the population may move against the selection pressure. To describe the long-run
behavior of such stochastic dynamics, Foster and Young (1990) introduced the concept of stochastic
stability. A configuration of a system is stochastically stable if it has a positive probability in the
stationary state of the above dynamics in the limit of no mutations. It means that in the long run we
observe it with a positive frequency. Kandori, Mailath, and Rob (1993) showed that in their model, 
the risk-dominant strategy is stochastically stable - if the mutation level is small enough we observe 
it in the long run with the frequency close to one. In the model of Robson and Vega-Redondo (1996), 
the efficient strategy is stochastically stable. It is one of very few models in which an efficient strategy 
is stochastically stable in the presence of a risk-dominant one. The population evolves in the long run 
to a state with the maximal fitness. 

The main goal of our paper is to investigate the effect of the number of players on the long-run 
behavior of the Robson-Vega-Redondo model. We will discuss parallel and sequential dynamics, and
an intermediate one, where each individual has in each period a revision opportunity with some independent
probability. We will show that in the last two dynamics, for any arbitrarily low but fixed level of
mutations, if the number of players is sufficiently big, a risk-dominant strategy is played in the long
run with a frequency close to one - a stochastically stable efficient strategy is observed with a very low
frequency. It means that when the number of players increases, the population undergoes a transition 
between an efficient payoff-dominant equilibrium and a risk-dominant one. We will also show that 
for some range of payoff parameters, stochastic stability itself depends on the number of players. If 
the number of players is below a certain value (which may be arbitrarily large), then a risk-dominant
strategy is stochastically stable. Only if n is large enough does an efficient strategy become stochastically
stable, as proved by Robson and Vega-Redondo (1996).

In Section 2, we introduce the Kandori-Mailath-Rob and Robson-Vega-Redondo models and review
their main properties. In Section 3, we analyze the Robson-Vega-Redondo model in the limit of the
infinite number of players and show our main results. Discussion follows in Section 4. 

2. Models of Adaptive Dynamics with Mutations 

We will consider a finite population of n individuals who have at their disposal one of two strategies: 
A and B. At every discrete moment of time, t = 1, 2, ..., they are randomly paired (we assume that n
is even) to play a two-player symmetric game with payoffs given by the following matrix:

$$
U = \begin{array}{c|cc}
  & A & B \\ \hline
A & a & b \\
B & c & d
\end{array}
$$

where the ij entry, i,j = A,B, is the payoff of the first (row) player when he plays the strategy i
and the second (column) player plays the strategy j. We assume that both players are the same and 
hence payoffs of the column player are given by the matrix transposed to U ; such games are called 
symmetric. 

We assume that a > c and d > b, therefore both A and B are evolutionarily stable strategies, 
and a + b < c + d, so the strategy B has a higher expected payoff against a player playing both 
strategies with the probability 1/2. We say that B risk dominates the strategy A (the notion of
risk-dominance was introduced and thoroughly studied by Harsanyi and Selten (1988)). We also
assume that a > d, hence we have a selection problem of choosing between the risk-dominant B and
the so-called payoff-dominant or efficient strategy A. 
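
The four payoff assumptions above can be checked mechanically. The following is a minimal sketch; the function names and the numerical payoffs are illustrative choices, not taken from the paper. It also computes the fraction of A-players at which both strategies earn the same against the population, obtained by solving a x + b (1 - x) = c x + d (1 - x).

```python
from fractions import Fraction

def check_game(a, b, c, d):
    """Return the assumptions of Section 2 as a dict of booleans."""
    return {
        "A_is_ESS": a > c,                 # A is a strict best reply to A
        "B_is_ESS": d > b,                 # B is a strict best reply to B
        "B_risk_dominant": a + b < c + d,  # B does better against a 1/2-1/2 mix
        "A_payoff_dominant": a > d,        # all-A yields the highest payoff
    }

def mixed_nash(a, b, c, d):
    """Fraction x of A at which A and B earn the same (integer payoffs assumed)."""
    return Fraction(d - b, (a - c) + (d - b))

# Hypothetical integer payoffs satisfying all four assumptions.
a, b, c, d = 4, 0, 3, 2
print(check_game(a, b, c, d))
print(mixed_nash(a, b, c, d))
```

Exact rational arithmetic is used only to keep the equilibrium fraction free of rounding; floating-point payoffs would need a different return type.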

At every discrete moment of time t, the state of our population is described by the number of 
individuals, $z_t$, playing A. Formally, by the state space we mean the set

$$\Omega = \{z,\ 0 \le z \le n\}.$$

Now we will describe the dynamics of our system. It consists of two components: selection and 
mutation. The selection mechanism ensures that if the mean payoff of a given strategy, $\pi_i(z_t)$, i = A, B,
at the time t is bigger than the mean payoff of the other one, then the number of individuals playing
the given strategy should increase in t + 1. In their paper, Kandori, Mailath, and Rob (1993) write

$$
\pi_A(z_t) = \frac{a(z_t - 1) + b(n - z_t)}{n - 1}, \qquad
\pi_B(z_t) = \frac{c z_t + d(n - z_t - 1)}{n - 1}, \qquad (2.1)
$$

provided $0 < z_t < n$.

It means that in every time step, players are paired infinitely many times to play the game or
equivalently, each player plays with every other player and his payoff is the sum of corresponding 
payoffs. This model may be therefore considered as an analog of replicator dynamics for populations 
with fixed numbers of players. 

The selection dynamics is formalized in the following way: 

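$$
\begin{aligned}
z_{t+1} &> z_t \quad \text{if } \pi_A(z_t) > \pi_B(z_t), \\
z_{t+1} &< z_t \quad \text{if } \pi_A(z_t) < \pi_B(z_t), \\
z_{t+1} &= z_t \quad \text{if } \pi_A(z_t) = \pi_B(z_t), \\
z_{t+1} &= z_t \quad \text{if } z_t = 0 \text{ or } z_t = n.
\end{aligned} \qquad (2.2)
$$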
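
As an illustration of (2.1) and (2.2), the sketch below evaluates the average payoffs and reports only the direction in which selection pushes the state. The function names and payoff values are assumptions made for the example; the rule (2.2) itself does not fix the size of the step.

```python
def kmr_payoffs(z, n, a, b, c, d):
    """Mean payoffs (2.1) of A and B when z of n players use A, 0 < z < n."""
    if not 0 < z < n:
        raise ValueError("payoffs (2.1) are defined only for 0 < z < n")
    pi_a = (a * (z - 1) + b * (n - z)) / (n - 1)
    pi_b = (c * z + d * (n - z - 1)) / (n - 1)
    return pi_a, pi_b

def selection_direction(z, n, a, b, c, d):
    """+1 if rule (2.2) pushes z up, -1 if down, 0 if z does not move."""
    if z in (0, n):
        return 0
    pi_a, pi_b = kmr_payoffs(z, n, a, b, c, d)
    return (pi_a > pi_b) - (pi_a < pi_b)

# Hypothetical payoffs; with n = 10 the direction flips between z = 7 and z = 8.
a, b, c, d = 4, 0, 3, 2
print([selection_direction(z, 10, a, b, c, d) for z in range(11)])
```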
Now mutations are added. Players may switch to new strategies with the probability $\epsilon$. It is easy to
see that for any two states of the population there is a positive probability of the transition between
them in some finite number of time steps. We have therefore obtained an irreducible Markov chain
with n + 1 states. It has a unique stationary probability distribution (a stationary state) which we
denote by $\mu_n^\epsilon$. It was shown (Kandori et al., 1993) that $\lim_{\epsilon \to 0} \mu_n^\epsilon(0) = 1$, which means that in the
long run, in the limit of no mutations, all players play the risk-dominant strategy B. We say that the
risk-dominant strategy is stochastically stable. 

The general set up in the Robson-Vega-Redondo model (1996) is the same. However, individuals 
are paired only once at every time step and play only one game before the selection process takes place.
Let $p_t$ denote the random variable which describes the number of cross-pairings, i.e. the number of
pairs of matched individuals playing different strategies at the time t. Let us notice that $p_t$ depends
on $z_t$. For a given realization of $p_t$ and $z_t$, mean payoffs obtained by each strategy are as follows:

$$
\tilde{\pi}_A(z_t, p_t) = \frac{a(z_t - p_t) + b p_t}{z_t}, \qquad
\tilde{\pi}_B(z_t, p_t) = \frac{c p_t + d(n - z_t - p_t)}{n - z_t}, \qquad (2.3)
$$

provided $0 < z_t < n$. Then the authors show that the payoff-dominant strategy is stochastically stable.
We will outline their proof. 
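
Before the outline, it may help to see how strongly the realized matching matters in (2.3). The sketch below lists, for one state, every feasible number of cross-pairings and the strategy that comes out ahead. The helper names and the payoffs are assumptions for illustration; the feasibility rule only uses the fact that A-A pairs and B-B pairs each consume an even number of players.

```python
def rvr_payoffs(z, p, n, a, b, c, d):
    """Mean payoffs (2.3) given z A-players and p cross-pairings, 0 < z < n."""
    pi_a = (a * (z - p) + b * p) / z
    pi_b = (c * p + d * (n - z - p)) / (n - z)
    return pi_a, pi_b

def feasible_cross_pairings(z, n):
    """Values of p compatible with z A-players among n players (n even)."""
    return [p for p in range(min(z, n - z) + 1) if (z - p) % 2 == 0]

# Hypothetical payoffs and state.
a, b, c, d = 4, 0, 3, 2
n, z = 10, 6
for p in feasible_cross_pairings(z, n):
    pi_a, pi_b = rvr_payoffs(z, p, n, a, b, c, d)
    print(p, pi_a, pi_b, "A ahead" if pi_a > pi_b else "B ahead or tie")
```

In this example the same state z = 6 favors A for few cross-pairings and B for many, which is the matching-induced noise that distinguishes this model from the one based on (2.1).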

First of all, one can show that there exists k such that if n is large enough and $z_t > k$, then there is a
positive probability (a certain realization of $p_t$) that after a finite number of steps of the mutation-free
selection dynamics, all players will play A. Likewise, if $z_t < k$ (for any k > 1), then if the number of
players is large enough, then after a finite number of steps of the mutation-free selection dynamics all
players will play B. In other words, z = 0 and z = n are the only absorbing states of the mutation-free
dynamics. Moreover, if n is large enough, then if $z_t > n - k$, then the mean payoff obtained by A
is always (for any realization of $p_t$) bigger than the mean payoff obtained by B (in the worst case
all B-players play with A-players). Therefore the size of the basin of attraction of the state z = 0 is
at most n - k - 1 and that of z = n is at least n - k. Observe that mutation-free dynamics is not
deterministic ($p_t$ describes the random matching) and therefore basins of attraction may overlap. It
follows that the system needs at least k + 1 mutations to evolve from z = n to z = 0 and at most
k mutations to evolve from z = 0 to z = n. Now using a tree representation of stationary states of
irreducible Markov chains (Freidlin and Wentzell, 1970 and 1984; see also Appendix B) Robson and
Vega-Redondo finish the proof and show that the efficient strategy is stochastically stable.



EQUILIBRIUM SELECTION IN EVOLUTIONARY GAMES WITH RANDOM MATCHING OF PLAYERS 6 

However, as outlined above, their proof requires the number of players to be sufficiently large. We 
will now show that a risk-dominant strategy is stochastically stable if the number of players is below
a certain value which can be arbitrarily big.

If the population consists of only one B-player and n - 1 A-players and if c > [a(n - 2) + b]/(n - 1),
that is n < (2a - c - b)/(a - c), then $\tilde{\pi}_B > \tilde{\pi}_A$. It means that one needs only one mutation to evolve
from z = n to z = 0. It is easy to see that two mutations are necessary to evolve from z = 0 to z = n.
Using again the tree representation of stationary states one can prove the following theorem.

Theorem 1. If n < (2a - c - b)/(a - c), then the risk-dominant strategy B is stochastically stable in the case of
random matching of players.
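
The condition in Theorem 1 comes from comparing the payoff c of a single B-mutant with the mean payoff [a(n - 2) + b]/(n - 1) of the A-players. A small sketch, with hypothetical payoffs and function names, confirms that the direct comparison and the bound on n agree:

```python
def one_mutation_escape(n, a, b, c):
    """True if a single B-mutant among n - 1 A-players earns more than the A-players do on average."""
    return c > (a * (n - 2) + b) / (n - 1)

def theorem1_bound(a, b, c):
    """The bound (2a - c - b)/(a - c) on the number of players."""
    return (2 * a - c - b) / (a - c)

# Hypothetical payoffs (d does not enter the condition).
a, b, c = 4, 0, 3
print(theorem1_bound(a, b, c))
print([n for n in range(2, 12, 2) if one_mutation_escape(n, a, b, c)])
```

Choosing c closer to a makes the bound larger, which is the sense in which the critical population size can be arbitrarily big.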

To see stochastically stable states, we need to take the limit of no mutations. We will now examine 
the long-run behavior of the Robson-Vega-Redondo model for a fixed level of mutations in the limit 
of the infinite number of players. 

3. Long-Run Behavior in the Limit of Infinitely Many Players

We will consider three specific cases of the selection rule (2.2). 

In the parallel dynamics, everyone in the selection process chooses at the same time (all players are 
synchronized) a strategy with the bigger average payoff. It means that after mutations have taken 
place, the selection moves the population to one of the two extreme states, z = 0 or z = n. Our
system becomes then a two-state Markov chain with a unique stationary state $\mu_n^\epsilon$ (a similar model
was discussed in (Vega-Redondo, 1997)). We will show that for any number of players, if the mutation 
level is sufficiently small, then in the long run almost all individuals play the payoff-dominant strategy. 
The same result holds for any small mutation level if the number of players is large enough. 

Theorem 2

In the parallel dynamics,

$$
\lim_{\epsilon \to 0} \mu_n^\epsilon(n) = 1 \quad \text{for every } n,
$$

$$
\lim_{n \to \infty} \mu_n^\epsilon(n) = 1 \quad \text{for every } \epsilon < 1.
$$

Proof: We are looking for a unique stationary state $\mu_n^\epsilon$ of a two-state Markov chain. Let us denote
by $p_{0n}$ a transition probability from the state z = 0 to z = n and by $p_{n0}$ from z = n to z = 0. We
have

$$\mu_n^\epsilon(n) = \frac{p_{0n}}{p_{0n} + p_{n0}}. \qquad (3.1)$$

For the transition from z = 0 to z = n it is enough that two players mutate from B to A and then
they are paired to play a game. It follows that

$$p_{0n} > \epsilon^2 \frac{1}{n - 1}. \qquad (3.2)$$

Transition from z = n to z = 0 requires at least $\gamma n$ mutations (for some $\gamma$) which means that

$$p_{n0} < \epsilon^{\gamma n}. \qquad (3.3)$$

It follows from (3.1-3.3) that

$$\mu_n^\epsilon(n) > \frac{1}{1 + (n - 1)\epsilon^{\gamma n - 2}}.$$

Hence $\mu_n^\epsilon(n)$ is arbitrarily close to one if $\epsilon$ is sufficiently small or n is sufficiently big.
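
To get a feeling for how fast the bound from the proof approaches one, it can be tabulated. This is only a sketch: the lower bound 1/(1 + (n - 1) eps^(gamma n - 2)) is the one obtained by combining (3.1) with the two estimates on the transition probabilities, and the value of gamma below is a hypothetical choice, since the proof only asserts that some gamma exists.

```python
def parallel_lower_bound(n, eps, gamma):
    """Lower bound 1 / (1 + (n - 1) * eps**(gamma*n - 2)) on the stationary mass of z = n."""
    return 1.0 / (1.0 + (n - 1) * eps ** (gamma * n - 2))

gamma = 0.25  # hypothetical; the text only says "for some gamma"
for n in (10, 20, 40, 80):
    print(n, parallel_lower_bound(n, 0.1, gamma))
for eps in (0.1, 0.01, 0.001):
    print(eps, parallel_lower_bound(10, eps, gamma))
```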



Now we will analyze the other extreme case of a selection rule (2.2) - a sequential dynamics where
in one time unit only one player can change his strategy. Although our dynamics is discrete in time,
it captures the essential features of continuous-time models, where every player has an exponentially
distributed waiting time to a moment of a revision opportunity. The probability that two or more players
revise their strategies at the same time is therefore equal to zero - this is an example of a birth and
death process.

The number of A-players in the population may increase by one in t + 1, if a B-player is chosen in t
which happens with the probability $(n - z_t)/n$. Analogously, the number of B-players in the population
may increase by one in t + 1, if an A-player is chosen in t which happens with the probability $z_t/n$.

The player who has a revision opportunity chooses in t + 1 with the probability $1 - \epsilon$ the strategy
with a higher average payoff in t and the other one with the probability $\epsilon$.

Let $r(k) = P(\tilde{\pi}_A(z_t, p_t) > \tilde{\pi}_B(z_t, p_t))$ and $l(k) = P(\tilde{\pi}_A(z_t, p_t) < \tilde{\pi}_B(z_t, p_t))$, where $k = z_t$. The sequential
dynamics is described by the following transition probabilities:

if $z_t = 0$, then $z_{t+1} = 1$ with the probability $\epsilon$ and $z_{t+1} = 0$ with the probability $1 - \epsilon$,

if $z_t = n$, then $z_{t+1} = n - 1$ with the probability $\epsilon$ and $z_{t+1} = n$ with the probability $1 - \epsilon$,

if $z_t \ne 0, n$, then $z_{t+1} = z_t + 1$ with the probability

$$r(k)\,\frac{n - z_t}{n}\,(1 - \epsilon) + (1 - r(k))\,\frac{n - z_t}{n}\,\epsilon$$

and $z_{t+1} = z_t - 1$ with the probability

$$l(k)\,\frac{z_t}{n}\,(1 - \epsilon) + (1 - l(k))\,\frac{z_t}{n}\,\epsilon.$$
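
Because the sequential dynamics is a birth and death chain, its stationary distribution can be computed from detailed balance. The sketch below does this for a user-supplied r(k). Two assumptions are made that are not in the text: ties between the two mean payoffs are ignored, so that l(k) = 1 - r(k) unless l is given, and in the example r(k) is replaced by a step function located at the mixed equilibrium, which is only its large-n limit.

```python
import math

def sequential_stationary(n, eps, r, l=None):
    """Stationary distribution on z = 0..n of the sequential dynamics."""
    if l is None:
        l = lambda k: 1.0 - r(k)  # assumption: payoff ties have probability zero

    def up(k):    # probability of z -> z + 1
        if k == 0:
            return eps
        return (n - k) / n * (r(k) * (1 - eps) + (1 - r(k)) * eps)

    def down(k):  # probability of z -> z - 1
        if k == n:
            return eps
        return k / n * (l(k) * (1 - eps) + (1 - l(k)) * eps)

    # Detailed balance: mu(k + 1) / mu(k) = up(k) / down(k + 1); logs avoid overflow.
    logw = [0.0]
    for k in range(n):
        logw.append(logw[-1] + math.log(up(k)) - math.log(down(k + 1)))
    top = max(logw)
    w = [math.exp(x - top) for x in logw]
    total = sum(w)
    return [x / total for x in w]

# Hypothetical parameters; gamma2 stands for the mixed Nash equilibrium fraction.
n, eps, gamma2 = 60, 0.05, 2 / 3
mu = sequential_stationary(n, eps, lambda k: 1.0 if k > gamma2 * n else 0.0)
print(sum(mu[: int(0.2 * n) + 1]))  # stationary mass on z <= 0.2 n
```

With a threshold above one half, the chain spends almost all of its time near z = eps n, because climbing to the threshold from below requires more unfavorable revisions than descending to it from above.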
In the dynamics intermediate between the parallel and sequential one, at each time period, each individual
has a revision opportunity with some probability $\tau < 1$. Each chosen player follows independently the
same rule as in the sequential dynamics. The probability that in one period, a given player will have
a revision opportunity should be proportional to the length of the period (which we normalized to 1
in our models). For a fixed $\epsilon$ and an arbitrarily large but fixed n, we consider the limit of continuous
time, $\tau \to 0$, and show that the limiting behavior is already obtained for a sufficiently small $\tau$, namely
T < e/n^. 

For an interesting discussion on the importance of the order of taking different limits ($\tau \to 0$, $n \to \infty$,
and $\epsilon \to 0$) in evolutionary models (especially in the Aspiration and Imitation model) see Samuelson
(1997).

In the intermediate dynamics, instead of $(n - z_t)/n$ and $z_t/n$ probabilities we have more involved
combinatorial factors. In order to get rid of these inconvenient factors, we will enlarge the state space
of the population. The state space $\Omega'$ is the set of all configurations of players, that is all possible
assignments of strategies to individual players. Therefore, a state $z_t = k$ in $\Omega$ consists of $\binom{n}{k}$ states
in $\Omega'$. The sequential dynamics is not anymore a birth and death process on $\Omega'$. However, we will be
able to treat both dynamics in the same framework.

We will show that for any arbitrarily low but fixed level of mutation, if the number of players is 
large enough, then in the long run only a small fraction of the population play the payoff-dominant 
strategy. The smaller the mutation level is, the fewer players use the payoff-dominant strategy.

The following two theorems are proved in Appendix C.

Theorem 3

In the sequential dynamics, for any $\delta > 0$ and $\beta > 0$ there exist $\epsilon(\delta, \beta)$ and $n(\epsilon)$ such that for any
$n > n(\epsilon)$

$$\mu_n^\epsilon(z \le \beta n) > 1 - \delta.$$

Theorem 4

In the intermediate dynamics, for any $\delta > 0$ and $\beta > 0$ there exist $\epsilon(\delta, \beta)$ and $n(\epsilon)$ such that
for any n > n{e) and r < ^ 

$$\mu_n^\epsilon(z \le \beta n) > 1 - \delta.$$
Let us note that the above theorems concern an ensemble of configurations, not an individual one.
In the limit of the infinite number of players, that is the infinite number of configurations, every single
configuration has zero probability in the stationary state. It is an ensemble of configurations that
might be stable (Miękisz, 2004a and 2004b).

Let us now assume that at every time period, players are matched many times. It follows from
the results in (Kandori et al., 1993; Robson and Vega-Redondo, 1996) analysed in (Vega-Redondo,
1996) that the limits of zero mutations and the infinite number of matching rounds per period do not
commute. In the limit of the infinite number of matching rounds per period, individuals play against
the average strategy and we obtain the Kandori-Mailath-Rob model and their conclusion follows. On
the other hand, for any fixed number of matching rounds (the Robson-Vega-Redondo model), the limit
of zero mutations gives us the stochastic stability of an efficient strategy. Here we investigated the 
effect of the number of players on the long-run behavior in the random matching model. We showed 
that the limit of the infinite number of players has the same effect as the limit of the infinite number 
of matching rounds. In fact, the probability that the average payoff of strategy A is bigger than 
the average payoff of strategy B converges in both limits to 1 or 0, if the fraction of the population
playing A is respectively to the right or to the left of the unique mixed Nash equilibrium. Both limits are
therefore alternative ways of representing the idea of a low matching-induced noise. 



4. Conclusion 

We studied the effect of the number of players on the long-run behavior in the adaptive dynamics 
with mutations and random matching of players. We showed that in the sequential dynamics for any 
arbitrarily low but fixed level of mutation, if the number of players is large enough, then in the long 
run almost all of them play a risk-dominant strategy. The same result holds if at any period, each 
individual has a revision opportunity with some small probability. This is in contrast with the result 
of Robson and Vega-Redondo (1996) who for a fixed number of players take the limit of zero mutations 
and obtain stochastic stability of a payoff-dominant strategy. It means that when the number of players 
increases, the population undergoes a transition between an efficient payoff-dominant equilibrium and 
a risk-dominant one. Therefore, in any specific model, to describe its long-run behavior, one has to 
evaluate the number of players and the mutation level. 

Acknowledgments: I would like to thank the Polish Committee for Scientific Research for financial 
support under the grant KBN 5 P03A 025 20. 
Appendix A.

Random variable of cross-pairings

We will first investigate the random variable $p_t$ which describes the number of cross-pairings in a state
$z_t$. Let $z_t = \alpha n$. Let P be the probability mass function of the random variable of cross-pairings p;
we skip the subscript t.

Proposition 1

$$P(|p - n\alpha(1 - \alpha)| > \beta n) \to_{n \to \infty} 0 \qquad (A.1)$$

for any $\beta > 0$.

Proof: Let the number of A-players be equal to $k = \alpha n$. We begin by dividing all players into two
groups. We arrange them randomly in a row and pick the first n/2 ones to be members of the first
group. These will be the players who will choose randomly their opponents. Let X denote the random
variable counting the number of A-players in this group, $X = X_1 + \dots + X_{n/2}$, where $X_i = 1$ if the i-th
player plays A; otherwise $X_i = 0$. The expected value and the variance of $X_i$ are equal to $E(X_i) = \alpha$
and $Var(X_i) = \alpha(1 - \alpha)$ respectively. One then has that

$$E(X) = E(X_1) + \dots + E(X_{n/2}) = \alpha n/2, \qquad (A.2)$$

$$Var(X) = Var(X_1) + \dots + Var(X_{n/2}) + 2\sum_{j<k}\bigl(E(X_j X_k) - E(X_j)E(X_k)\bigr) = \frac{n}{2}\alpha(1 - \alpha)\left(1 - \frac{n/2 - 1}{n - 1}\right). \qquad (A.3)$$

From the Chebyshev inequality we get that

$$P(|X - E(X)| > \beta_1 n) < \frac{Var(X)}{(\beta_1 n)^2} \to_{n \to \infty} 0 \qquad (A.4)$$

for every $\beta_1 > 0$.

Now every player from the first group is randomly paired with a player from the second group. Let
us first assume (for pedagogical reasons) that the number of A-players in the first group (and therefore
in the second group) is exactly equal to $\alpha n/2$.

Let Y be the random variable describing the number of cross-pairings for a given realization of X.
$Y = Y_1 + \dots + Y_{n/2}$, where $Y_i = 1$ if the i-th player has chosen the opponent with a different strategy;
$Y_i = 0$ otherwise. The expected value of $Y_i$ is equal to $E(Y_i) = 1 - \alpha$ if $X_i = 1$ and $E(Y_i) = \alpha$ if
$X_i = 0$, and $Var(Y_i) = \alpha(1 - \alpha)$.

We get

$$E(Y) = E(Y_1) + \dots + E(Y_{n/2}) = \alpha(1 - \alpha)n \qquad (A.5)$$

and using the formula for the variance of the sum of the random variables in (A.3) we obtain

VariY) = -ail - a) + + ^ (a.6) 

Now let the number of A-players in the first group be equal to $(\alpha + \alpha_1)n/2$ for some $\alpha_1$. We get
that

$$E(Y) = n\alpha(1 - \alpha) + n\alpha_1^2. \qquad (A.7)$$

We again use the formula in (A.3) and get that

$$Var(Y) = C(\alpha, \alpha_1)\,O(n), \qquad (A.8)$$

where $C(\alpha, \alpha_1)$ is some constant depending on $\alpha$ and $\alpha_1$ and $\lim_{n \to \infty} O(n)/n = 1$. For any fixed
number of A-players in the first group we use the Chebyshev inequality to get

$$P(|Y - E(Y)| > \beta_2 n) \to_{n \to \infty} 0 \qquad (A.9)$$

for every $\beta_2 > 0$.

Now we set $\beta_1 = \alpha_1$ in (A.4). Then Proposition 1 follows with $\beta = \beta_1 + \beta_2$.
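
Proposition 1 can be illustrated numerically. The sketch below draws uniformly random perfect matchings (by shuffling the players and pairing neighbours, an implementation choice made here) and records the largest observed deviation of p/n from alpha(1 - alpha). The seed, sample sizes and function names are arbitrary assumptions.

```python
import random

def cross_pairings(z, n, rng):
    """Count A-B pairs in a uniformly random perfect matching of z A-players and n - z B-players."""
    players = [1] * z + [0] * (n - z)
    rng.shuffle(players)
    return sum(players[i] != players[i + 1] for i in range(0, n, 2))

def max_relative_deviation(alpha, n, samples, rng):
    """Largest observed |p - n*alpha*(1 - alpha)| / n over the given number of samples."""
    z = round(alpha * n)
    target = n * alpha * (1 - alpha)
    return max(abs(cross_pairings(z, n, rng) - target) / n for _ in range(samples))

rng = random.Random(1)  # arbitrary seed
for n in (100, 1000, 4000):
    print(n, max_relative_deviation(0.5, n, 100, rng))
```

The printed deviations should shrink roughly like one over the square root of n, in line with a variance of order n.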

Now for any state of the system, $z = k$, $k \ne 0, n$, we will calculate, in the limit of the infinite number
of players, the probability, r(k), that the average payoff of A is bigger than that of B. We have

$$r(k) = P\left(\frac{a(k - p) + bp}{k} > \frac{cp + d(n - k - p)}{n - k}\right). \qquad (A.10)$$

Let $k = \alpha n$. It follows from (A.10) that

$$r(\alpha n) = P\left(p\left(\frac{d - c}{n(1 - \alpha)} + \frac{b - a}{\alpha n}\right) > d - a\right). \qquad (A.11)$$

If $(d - c)/(1 - \alpha) + (b - a)/\alpha > 0$, then $r(\alpha n) = 1$ because d < a. This happens for
$\alpha > (a - b)/(a - c + d - b) = \gamma_1 > 1/2$. Let us notice that if c < d, then $\gamma_1 < 1$, if c > d, then $\gamma_1 > 1$.
For $\alpha < \gamma_1$, from (A.11) we get

$$r(\alpha n) = P\left(p < \frac{n(d - a)(1 - \alpha)\alpha}{(d - c)\alpha + (b - a)(1 - \alpha)}\right). \qquad (A.12)$$

Now it follows from Proposition 1 that if

$$\frac{(d - a)(1 - \alpha)\alpha}{(d - c)\alpha + (b - a)(1 - \alpha)} < \alpha(1 - \alpha), \qquad (A.13)$$

which holds for $\alpha < \gamma_2 = (d - b)/(a - c + d - b)$, then

$$\lim_{n \to \infty} r(\alpha n) = 0. \qquad (A.14)$$

Note that $\gamma_2$, ($1/2 < \gamma_2 < 1$), is the unique mixed Nash equilibrium of the game. We have proved the
following proposition.

Proposition 2

If $\alpha > \gamma_1$, then $r(\alpha n) = 1$,
if $\gamma_2 < \alpha < \gamma_1$, then $\lim_{n \to \infty} r(\alpha n) = 1$,
if $\alpha < \gamma_2$, then $\lim_{n \to \infty} r(\alpha n) = 0$.
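
Proposition 2 lends itself to a Monte Carlo check. The sketch below estimates r(alpha n) by sampling random matchings and comparing the two mean payoffs directly. The payoffs, seed, population size and function name are hypothetical; for these payoffs the mixed equilibrium is at 2/3 and $\gamma_1$ exceeds one. For a finite n the estimates near the equilibrium are still far from their limits, so only values of alpha well away from it are informative.

```python
import random

def estimate_r(alpha, n, payoffs, samples, rng):
    """Monte Carlo estimate of the probability that A earns strictly more than B in state alpha*n."""
    a, b, c, d = payoffs
    k = round(alpha * n)
    players = [1] * k + [0] * (n - k)
    wins = 0
    for _ in range(samples):
        rng.shuffle(players)  # uniform random matching: neighbours form pairs
        p = sum(players[i] != players[i + 1] for i in range(0, n, 2))
        pi_a = (a * (k - p) + b * p) / k
        pi_b = (c * p + d * (n - k - p)) / (n - k)
        wins += pi_a > pi_b
    return wins / samples

payoffs = (4, 0, 3, 2)   # hypothetical payoffs
rng = random.Random(0)   # arbitrary seed
for alpha in (0.3, 0.6, 0.7, 0.9):
    print(alpha, estimate_r(alpha, 400, payoffs, 200, rng))
```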

Appendix B. 
Stationary states of irreducible Markov chains 

The following tree representation of stationary states of Markov chains was proposed by Freidlin and
Wentzell (1970 and 1984). Let $(\Omega, P)$ be an irreducible Markov chain with a state space $\Omega$ and
transition probabilities given by $P: \Omega \times \Omega \to [0, 1]$. It has a unique stationary probability distribution
$\mu$ (called also a stationary state). For $x \in \Omega$, an x-tree is a directed graph on $\Omega$ such that from every
$y \ne x$ there is a unique path to x and there are no outgoing edges out of x. Denote by T(x) the set
of all x-trees and let

$$q(x) = \sum_{d \in T(x)} \prod_{(y, y') \in d} P(y, y'), \qquad (B.1)$$

where the product is with respect to all edges of d. Now one can show that

$$\mu(x) = \frac{q(x)}{\sum_{y \in \Omega} q(y)} \qquad (B.2)$$

for all $x \in \Omega$.
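
For a very small chain the tree representation can be verified by brute force. The sketch below enumerates all x-trees, treating each one as a choice of a single successor for every state other than x such that all paths lead to x. It then normalizes the tree weights and checks stationarity. The transition matrix and function names are hypothetical, and the enumeration grows far too fast to be used beyond a handful of states.

```python
from itertools import product

def reaches(y, x, succ, n):
    """Follow successor pointers from y for at most n steps; True if x is reached."""
    for _ in range(n):
        if y == x:
            return True
        y = succ[y]
    return y == x

def tree_weights(P):
    """Sum over x-trees of the product of edge probabilities, for every state x."""
    n = len(P)
    q = [0.0] * n
    for x in range(n):
        others = [y for y in range(n) if y != x]
        for choice in product(range(n), repeat=len(others)):
            succ = dict(zip(others, choice))  # one outgoing edge per non-root state
            if all(reaches(y, x, succ, n) for y in others):
                w = 1.0
                for y, s in succ.items():
                    w *= P[y][s]
                q[x] += w
    return q

def stationary_from_trees(P):
    """Normalize the tree weights to obtain the stationary distribution."""
    q = tree_weights(P)
    total = sum(q)
    return [v / total for v in q]

# Hypothetical irreducible 3-state chain; every row sums to one.
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.1, 0.0, 0.9]]
mu = stationary_from_trees(P)
print(mu)
print([sum(mu[i] * P[i][j] for i in range(3)) for j in range(3)])  # should reproduce mu
```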

A state is an absorbing one if it attracts nearby states in the mutation-free dynamics. We assume 
that after a finite number of steps of the mutation-free dynamics we arrive at one of the absorbing 
states (there are no other recurrence classes) and stay there forever. Then it follows from the above 
tree representation that any state different from absorbing states has zero probability in the stationary 
distribution in the zero-mutation limit. Moreover, in order to study the zero-mutation limit of the 
stationary state, it is enough to consider paths between absorbing states. More precisely, we construct 
x-trees with absorbing states as vertices; the family of such x-trees is denoted by $\tilde{T}(x)$. Let

$$q_m(x) = \max_{d \in \tilde{T}(x)} \prod_{(y, y') \in d} \tilde{P}(y, y'), \qquad (B.3)$$

where $\tilde{P}(y, y') = \max \prod_{(w, w')} P(w, w')$, where the product is taken along any path joining y with y' and
the maximum is taken with respect to all such paths. Now we may observe that if $\lim_{\epsilon \to 0} q_m(y)/q_m(x) = 0$,
for any $y \ne x$, then x is stochastically stable. Therefore we have to compare trees with the biggest
products in (B.3); such trees we call maximal.
Appendix C.

Proof of Theorem 3

Pick $\delta$, $\beta$, and $\epsilon$. It follows from the limiting properties of r(k) that there is $n(\epsilon, \delta)$ and $1/2 < \gamma_3 < \gamma_2$
such that for all $n > n(\epsilon, \delta)$ we have that $r(\alpha n) < \epsilon$ if $\alpha < \gamma_3$.
For any state in $\Omega'$ with z = k, we will prove that

$$q(k) < 3\epsilon\, q(k - 1), \quad 1 < k < \gamma_3 n, \qquad (C.1)$$

g(A;)<M^ILll, 73n<A;<n. (C.2) 

It follows from (C.1-C.2) that 

So<fc</3n ( ) li^) 

ti'n{^ < Pn) = ) ^ ( (C.3) 

^0<k</3n ( ^ 



> 



1 



1 + Eg^r ( I ) (36)^ + (36)73-1 ^n^^^^ n ^ ^^^,_^3„ 



if $\epsilon$ is small enough. The smaller $\beta$ is and the closer $\gamma_3$ is to 1/2, the smaller $\epsilon$ should be.

To prove (C.1-C.2), with every k-tree ($1 < k < \gamma_3 n$) we will associate a (k - 1)-tree. Let $\omega$ be a
k-tree. We reverse arrows on all edges on the unique path between k - 1 and k (all other edges we
leave unchanged). (C.1) follows from the bound

r{k){l - e) + {I - r{k))e 
{I - r{k - - e) + r{k - \)e 

and (C.2) from the bound 



< 3e 



r{k){l - e) + {I - r{k))e e 
(1 - r{k - 1))(1 - e) + r{k - l)e ^ 2' 



Proof of Theorem 4 

In the intermediate dynamics, the probability of moving m units to the right if r{k) < e or to 
the left if 1 — r{k) < e is not proportional to as in the sequential dynamics. Therefore to prove 
(C.1-C.2) we cannot simply reverse arrows on edges in constructing corresponding trees. 

To prove (C.1), with every k-tree ($1 < k < \gamma_3 n$) we will again associate a (k - 1)-tree. Let $\omega$ be
a k-tree. If on the unique path between k - 1 and k there are only transitions which involve only
one individual at any time period, then we reverse arrows on all edges on this path as in the proof
of Theorem 3. Otherwise, let an edge $j \to l$ be the first edge which involves at least two players.
If j > k - 1, then we reverse all arrows between k - 1 and j, cut the edge $j \to l$ and connect k to
k - 1. Because an edge was deleted, the correspondence between k-trees and (k - 1)-trees is not one-to-one
anymore. If the edge $j \to l$ involves m players, then there are at most $\binom{n}{m}$ k-trees with the same
corresponding (k - 1)-tree. By cutting the edge we decreased a probability at least $\tau^m$ times. If
rri^ < 1/2, then the series ^m>2 ( ^ ) is bounded by r. C.l follows. 

If j < k — 1, then we cut the edges /c — 1 — > and j ^ I, connect j to a state with z = j — 1 (only one 
player changes his strategy) and A; to A; — 1. By the above procedure we decreased a probability by r. 
There are at most k-tiees with the same corresponding {k — l)-tree. If rn^ < e, then (C.l) follows. 

(C.2) can be proved in an analogous way. Now Theorem 4 follows in the same way as Theorem 3. 